Abstract

Let $\mathcal{X}$ and $\mathcal{Y}$ be finite alphabets and $P_{XY}$ a joint distribution over them, with $P_X$ and $P_Y$ representing the marginals. For any $\epsilon > 0$, the set of $n$-length sequences $x^n$ and $y^n$ that are jointly typical [?] according to $P_{XY}$ can be represented on a bipartite graph. We present a formal definition of such a graph, known as a typicality graph, and study some of its properties.



I. INTRODUCTION 

The concept of typicality and typical sequences is central to information theory. It 
has been used to develop computable performance limits for several communication 
problems. 

Consider a pair of correlated discrete memoryless information sources $X$ and $Y$ characterized by a generic joint distribution $p_{XY}$ defined on the product of two finite sets $\mathcal{X} \times \mathcal{Y}$. A length-$n$ $X$-sequence $x^n$ is typical if the empirical histogram of $x^n$ is close to $p_X$. A pair of length-$n$ sequences $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$ is said to be jointly typical if the empirical joint histogram of $(x^n, y^n)$ is close to the joint distribution $p_{XY}$. The set of all jointly typical sequence pairs is called the typical set of $p_{XY}$.

1 We use the following notation throughout this work. Script capitals $\mathcal{U}, \mathcal{X}, \mathcal{Y}, \mathcal{Z}, \ldots$ denote finite, nonempty sets. To show the cardinality of a set $\mathcal{X}$, we use $|\mathcal{X}|$. We also use the letters $P, Q, \ldots$ for probability distributions on finite sets, and $U, X, Y, \ldots$ for random variables.

Given a sequence length $n$, the typical set can be represented in terms of the following undirected, bipartite graph. The left vertices of the graph are all the typical $X$-sequences, and the right vertices are all the typical $Y$-sequences. From well-known properties of typical sets, there are (approximately) $2^{nH(X)}$ left vertices and $2^{nH(Y)}$ right vertices. A left vertex is connected to a right vertex through an edge if the corresponding $X$- and $Y$-sequences are jointly typical. From the properties of joint typicality, we know that the number of edges in this graph is roughly $2^{nH(X,Y)}$. Further, every left vertex (a typical $X$-sequence) has degree roughly equal to $2^{nH(Y|X)}$, i.e., it is jointly typical with $2^{nH(Y|X)}$ $Y$-sequences. Similarly, each right vertex has degree roughly equal to $2^{nH(X|Y)}$.

In this paper we formally characterize the typicality graph and look at some subgraph containment problems. In particular, we answer three questions concerning the typicality graph:

• When can we find subgraphs such that the left and right vertices of the subgraph have specified degrees, say $R'_X$ and $R'_Y$, respectively?

• What is the maximum size of subgraphs that are complete, i.e., every left vertex is connected to every right vertex? One of the main contributions of this paper is a sharp answer to this question.

• If we create a subgraph by randomly picking a specified number of left and right vertices, what is the probability that this subgraph has far fewer edges than expected?

These questions arise in a variety of multiuser communication problems. Transmitting 
correlated information over a multiple-access channel (MAC) [?], and communicating 
over a MAC with feedback [?] are two problems where the first question plays an 
important role. The techniques used to answer the second question have been used to 
develop tighter bounds on the error exponents of discrete memoryless multiple-access 
channels [?], [?], [?]. The third question arises in the context of transmitting correlated 
information over a broadcast channel [?]. Moreover, the evaluation of performance limits 
of a multiuser communication problem can be thought of as characterizing certain 
properties of typicality graphs of random variables associated with the problem. 

The paper is organized as follows. Some preliminaries are introduced in Section II. In Section III, the typicality graphs are formally defined and some properties concerning the number of vertices, the number of edges, and degree conditions are obtained. The main result of the paper is obtained in Section IV.
II. Preliminaries

In this section, we provide a concise review of some of the results available in the literature on typical sequences, $\delta$-typical sets and their properties [?].
Definition 1: A sequence $x^n \in \mathcal{X}^n$ is $X$-typical with constant $\delta$ if

1) $\left|\frac{1}{n} N(a|x^n) - P_X(a)\right| \le \delta, \quad \forall a \in \mathcal{X}$

2) No $a \in \mathcal{X}$ with $P_X(a) = 0$ occurs in $x^n$.

The set of such sequences is denoted $T^n_\delta(P_X)$ or $T^n_\delta(X)$, when the distribution being used is unambiguous.

Definition 2: Given a conditional distribution $P_{Y|X}$, a sequence $y^n \in \mathcal{Y}^n$ is conditionally $P_{Y|X}$-typical with $x^n \in \mathcal{X}^n$ with constant $\delta$ if

1) $\left|\frac{1}{n} N(a,b|x^n,y^n) - \frac{1}{n} N(a|x^n) P_{Y|X}(b|a)\right| \le \delta, \quad \forall a \in \mathcal{X}, b \in \mathcal{Y}$.

2) $N(a,b|x^n,y^n) = 0$ whenever $P_{Y|X}(b|a) = 0$.

The set of such sequences is denoted $T^n_\delta(P_{Y|X}|x^n)$ or $T^n_\delta(Y|x^n)$, when the distribution being used is unambiguous.
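
Definitions 1 and 2 translate directly into counting checks. The following Python sketch is illustrative only: the function names and the dictionary encoding of $P_X$ and $P_{Y|X}$ are assumptions made here, not notation from the source, and the comparison with $\delta$ is taken to be non-strict.

```python
from collections import Counter


def is_typical(x, p_x, delta):
    """Definition 1: x is X-typical with constant delta.

    p_x maps each symbol a to P_X(a) (hypothetical encoding).
    """
    n = len(x)
    counts = Counter(x)
    # Condition 2: no zero-probability symbol may occur in x.
    if any(p_x.get(a, 0) == 0 for a in counts):
        return False
    # Condition 1: every empirical frequency is within delta of P_X(a).
    return all(abs(counts.get(a, 0) / n - p) <= delta for a, p in p_x.items())


def is_conditionally_typical(y, x, p_y_given_x, delta):
    """Definition 2: y is P_{Y|X}-typical given x with constant delta.

    p_y_given_x maps (b, a) to P_{Y|X}(b | a) and should list every pair.
    """
    n = len(x)
    n_a = Counter(x)            # N(a | x^n)
    n_ab = Counter(zip(x, y))   # N(a, b | x^n, y^n), keyed by (a, b)
    # Condition 2: pairs with zero conditional probability must not occur.
    if any(p_y_given_x.get((b, a), 0) == 0 for (a, b) in n_ab):
        return False
    # Condition 1: joint counts track N(a | x^n) * P_{Y|X}(b | a).
    return all(
        abs(n_ab.get((a, b), 0) / n - n_a.get(a, 0) / n * p) <= delta
        for (b, a), p in p_y_given_x.items()
    )
```

With $\delta = 0$ the first check reduces to asking whether $x^n$ has type exactly $P_X$, which is the case used for the type classes $T^n_0(\cdot)$ later in the paper.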

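We will repeatedly use the following results, which we state below as facts:

Fact 1 [?, Lemma 2.10]: (a) If $x^n \in T^n_\delta(X)$ and $y^n \in T^n_\epsilon(Y|x^n)$, then $(x^n, y^n) \in T^n_{\delta+\epsilon}(X,Y)$ and $y^n \in T^n_{(\delta+\epsilon)|\mathcal{X}|}(Y)$.

(b) If $x^n \in T^n_\delta(X)$ and $(x^n, y^n) \in T^n_\epsilon(X,Y)$, then $y^n \in T^n_{\delta+\epsilon}(Y|x^n)$.

Fact 2 [?, Lemma 2.13]: There exists a sequence $\epsilon_n \to 0$ depending only on $|\mathcal{X}|$ and $|\mathcal{Y}|$ such that for every joint distribution $P_X \cdot P_{Y|X}$ on $\mathcal{X} \times \mathcal{Y}$,

$$\left|\frac{1}{n}\log|T^n(X)| - H(X)\right| \le \epsilon_n, \qquad \left|\frac{1}{n}\log|T^n(Y|x^n)| - H(Y|X)\right| \le \epsilon_n, \quad \forall x^n \in T^n(X). \qquad (1)$$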

The next fact deals with the continuity of entropy with respect to probability distributions.

Fact 3 [?, Lemma 2.7]: If $P$ and $Q$ are two distributions on $\mathcal{X}$ such that

$$\sum_{x \in \mathcal{X}} |P(x) - Q(x)| \le \epsilon \le \frac{1}{2},$$

then

$$|H(P) - H(Q)| \le -\epsilon \log\frac{\epsilon}{|\mathcal{X}|}.$$

2 The typical sets are with respect to distributions $P_X$, $P_{Y|X}$ and $P_{XY}$, respectively.

3 The constants of the typical sets for each $n$, when suppressed, are understood to be some $\delta_n$ with $\delta_n \to 0$ and $\sqrt{n} \cdot \delta_n \to \infty$ (delta convention).
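
As a numerical illustration of Fact 3, the sketch below evaluates both sides of the bound for a pair of distributions. The helper names are hypothetical, logarithms are taken to base 2 (an assumption; the bound has the same form in any base), and the total variation is required to be at most $1/2$ as in the standard statement of the lemma.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def entropy_continuity_bound(p, q):
    """Return (eps, bound) with eps = sum |P - Q| and bound = -eps * log2(eps / |X|)."""
    eps = sum(abs(a - b) for a, b in zip(p, q))
    if not 0 < eps <= 0.5:
        raise ValueError("Fact 3 is stated for 0 < eps <= 1/2")
    return eps, -eps * math.log2(eps / len(p))


p = [0.5, 0.3, 0.2]
q = [0.45, 0.35, 0.2]
eps, bound = entropy_continuity_bound(p, q)
gap = abs(entropy_bits(p) - entropy_bits(q))
```

Here `gap` is the actual entropy difference and `bound` the right-hand side of Fact 3; the bound is typically loose, which is all the later proofs require.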

III. Typicality graphs 
Consider any joint distribution $P_X \cdot P_{Y|X}$ on $\mathcal{X} \times \mathcal{Y}$.

Definition 3: For any $\epsilon_{1n}, \epsilon_{2n}, \lambda_n \to 0$, the sequence of typicality graphs $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ is defined as follows. For every $n$, $G_n$ is a bipartite graph, with its left vertices consisting of all $x^n \in T^n_{\epsilon_{1n}}(X)$ and the right vertices consisting of all $y^n \in T^n_{\epsilon_{2n}}(Y)$. A vertex on the left (say $x^n$) is connected to a vertex on the right (say $y^n$) iff $(x^n, y^n) \in T^n_{\lambda_n}(X, Y)$.
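
For very small $n$ the graph of Definition 3 can be enumerated by brute force. The sketch below is only meant to make the definition concrete; the function names and the dictionary encoding of $P_{XY}$ are assumptions made here, and the enumeration is exponential in $n$.

```python
import itertools
from collections import Counter


def typical(seq, pmf, delta):
    """True if every empirical frequency of seq is within delta of pmf
    and no zero-probability symbol occurs (pmf must list every symbol)."""
    n = len(seq)
    counts = Counter(seq)
    return all(
        abs(counts.get(s, 0) / n - p) <= delta and (p > 0 or counts.get(s, 0) == 0)
        for s, p in pmf.items()
    )


def typicality_graph(p_xy, n, eps1, eps2, lam):
    """Enumerate left vertices, right vertices and edges of G_n.

    p_xy maps every pair (a, b) to P_XY(a, b) (hypothetical encoding).
    """
    xs = sorted({a for a, _ in p_xy})
    ys = sorted({b for _, b in p_xy})
    p_x = {a: sum(p_xy[(a, b)] for b in ys) for a in xs}
    p_y = {b: sum(p_xy[(a, b)] for a in xs) for b in ys}
    left = [s for s in itertools.product(xs, repeat=n) if typical(s, p_x, eps1)]
    right = [s for s in itertools.product(ys, repeat=n) if typical(s, p_y, eps2)]
    # A pair is an edge iff the sequence of symbol pairs is jointly typical.
    edges = [
        (x, y) for x in left for y in right
        if typical(tuple(zip(x, y)), p_xy, lam)
    ]
    return left, right, edges


p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
left, right, edges = typicality_graph(p_xy, n=6, eps1=0.2, eps2=0.2, lam=0.2)
```

At such a small $n$ the vertex and edge counts are far from their asymptotic values $2^{nH(X)}$, $2^{nH(Y)}$ and $2^{nH(X,Y)}$; the example only shows the structure of the graph.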

Remark. Henceforth, we will assume that the sequences $\epsilon_{1n}, \epsilon_{2n}, \lambda_n$ satisfy the 'delta convention' [?, Convention 2.11], i.e.,

$$\epsilon_{1n} \to 0, \qquad \sqrt{n} \cdot \epsilon_{1n} \to \infty \quad \text{as } n \to \infty,$$

with similar conditions for $\epsilon_{2n}$ and $\lambda_n$ as well. The delta convention ensures that the typical sets have 'large probability'.

We will use the notation $V_X(\cdot), V_Y(\cdot)$ to denote the vertex sets of any bipartite graph.
Some properties of the typicality graph:

1) From Fact 2, we know that for any sequence of typicality graphs $\{G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)\}$, the cardinality of the vertex sets satisfies

$$\left|\frac{1}{n}\log|V_X(G_n)| - H(X)\right| \le \epsilon_n, \qquad \left|\frac{1}{n}\log|V_Y(G_n)| - H(Y)\right| \le \epsilon_n \qquad (2)$$

for some sequence $\epsilon_n \to 0$.

2) The degree of each vertex $x^n \in V_X(G_n)$ and $y^n \in V_Y(G_n)$ satisfies

$$\text{degree}(x^n) \le 2^{n(H(Y|X)+\epsilon_n)}, \ \forall x^n \in V_X(G_n); \qquad \text{degree}(y^n) \le 2^{n(H(X|Y)+\epsilon_n)}, \ \forall y^n \in V_Y(G_n) \qquad (3)$$

for some $\epsilon_n \to 0$.

Proof: If $x^n \in T^n_{\epsilon_{1n}}(X)$ and $(x^n, y^n) \in T^n_{\lambda_n}(X, Y)$, then from Fact 1(b), $y^n \in T^n_{\epsilon_{1n}+\lambda_n}(Y|x^n)$. From the second part of Fact 2, we know that there exists a sequence $\epsilon_n \to 0$ such that

$$|T^n_{\epsilon_{1n}+\lambda_n}(Y|x^n)| \le 2^{n(H(Y|X)+\epsilon_n)} \qquad (4)$$

From this we conclude that $\text{degree}(x^n) \le 2^{n(H(Y|X)+\epsilon_n)}, \ \forall x^n \in V_X(G_n)$. An identical argument yields $\text{degree}(y^n) \le 2^{n(H(X|Y)+\epsilon_n)}, \ \forall y^n \in V_Y(G_n)$.
Property 2 gives upper bounds on the degree of each vertex in the typicality graph. Since we have not imposed any relationships between the typicality constants $\epsilon_{1n}, \epsilon_{2n}$ and $\lambda_n$, in general it cannot be said that the degree of every $X$-vertex (resp. $Y$-vertex) is close to $2^{nH(Y|X)}$ (resp. $2^{nH(X|Y)}$). However, such an assertion holds for almost every vertex in $G_n$. Specifically, we can show that the above degree conditions hold for a subgraph with exponentially the same size as $G_n$.

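Proposition 1: Every sequence of typicality graphs $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ has a sequence of subgraphs $\mathcal{A}_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ satisfying the following properties for some $\delta_n \to 0$.

1) The vertex set sizes $|V_X(\mathcal{A}_n)|$ and $|V_Y(\mathcal{A}_n)|$, denoted $\theta^n_X$ and $\theta^n_Y$, respectively, satisfy

$$\left|\frac{1}{n}\log\theta^n_X - H(X)\right| \le \delta_n, \qquad \left|\frac{1}{n}\log\theta^n_Y - H(Y)\right| \le \delta_n \quad \forall n.$$

2) The degree of each $X$-vertex $x^n$, denoted $\theta'_n(x^n)$, satisfies

$$\left|\frac{1}{n}\log\theta'_n(x^n) - H(Y|X)\right| \le \delta_n \quad \forall x^n \in V_X(\mathcal{A}_n).$$

3) The degree of each $Y$-vertex $y^n$, denoted $\theta'_n(y^n)$, satisfies

$$\left|\frac{1}{n}\log\theta'_n(y^n) - H(X|Y)\right| \le \delta_n \quad \forall y^n \in V_Y(\mathcal{A}_n).$$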



Proof: The vertex sets $V_X(G_n)$ and $V_Y(G_n)$ are the $\epsilon_{1n}$-typical and $\epsilon_{2n}$-typical sets of $P_X$ and $P_Y$, respectively. To define the subgraphs $\mathcal{A}_n$, we would like to choose the sequences with type $P_X$ and $P_Y$, respectively, as the vertex sets of the subgraph, with an edge connecting two sequences if they have joint type $P_{XY}$. However, the values taken by the joint pmfs $P_{XY}, P_X, P_Y$ may be any real number between 0 and 1, whereas the joint type of two $n$-sequences is always a rational number (with denominator $n$). So we choose the subgraph $\mathcal{A}_n$ as follows:

• For each $n$, approximate the values of $P_{XY}$ to rational numbers with denominator $n$ to obtain the pmf $\tilde{P}_{XY}$. Clearly $\tilde{P}_{XY}$ is a valid joint type of length $n$ and the maximum approximation error is bounded by $\frac{1}{n}$. In fact, $\forall (x, y)$, we have for all sufficiently large $n$:

$$|\tilde{P}_{XY}(x,y) - P_{XY}(x,y)| \le \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \lambda_n, \qquad (5)$$

where the last inequality follows from the delta convention. Using Fact 1, we also have

$$|\tilde{P}_X(x) - P_X(x)| \le |\mathcal{Y}| \cdot \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \epsilon_{1n} \qquad (6)$$

$$|\tilde{P}_Y(y) - P_Y(y)| \le |\mathcal{X}| \cdot \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \epsilon_{2n} \qquad (7)$$

• The left vertex set of $\mathcal{A}_n$ is $T^n_0(\tilde{P}_X)$, i.e., the set of $x^n$ sequences with type $\tilde{P}_X$. The right vertex set of $\mathcal{A}_n$ is $T^n_0(\tilde{P}_Y)$, the set of $y^n$ sequences with type $\tilde{P}_Y$. A vertex in $V_X(\mathcal{A}_n)$, say $a^n$, is connected to a vertex in $V_Y(\mathcal{A}_n)$, say $b^n$, iff $(a^n, b^n) \in T^n_0(\tilde{P}_{XY})$, i.e., $(a^n, b^n)$ have joint type $\tilde{P}_{XY}$.

From (5), (6) and (7), we have

$$T^n_0(\tilde{P}_X) \subset T^n_{\epsilon_{1n}}(P_X), \qquad T^n_0(\tilde{P}_Y) \subset T^n_{\epsilon_{2n}}(P_Y) \qquad \text{and} \qquad T^n_0(\tilde{P}_{XY}) \subset T^n_{\lambda_n}(P_{XY}).$$

Hence $\mathcal{A}_n$ is a subgraph of $G_n$, as required.
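
The proof only requires some approximation of $P_{XY}$ by a joint type with denominator $n$ and per-entry error at most $1/n$; the source does not say how the rounding is done. One possible choice, sketched below under that assumption, is a largest-remainder rule. The function name is hypothetical.

```python
import math
from fractions import Fraction


def nearest_type(pmf, n):
    """Round a pmf (dict: outcome -> probability) to a type with denominator n.

    Largest-remainder rounding (our choice, not the source's): take floors of
    n * p, then hand the leftover counts to the largest fractional parts.
    Every entry moves by less than 1/n, and zero-probability outcomes stay at 0.
    """
    floors = {k: math.floor(n * p) for k, p in pmf.items()}
    leftover = n - sum(floors.values())
    by_remainder = sorted(pmf, key=lambda k: n * pmf[k] - floors[k], reverse=True)
    for k in by_remainder[:leftover]:
        floors[k] += 1
    return {k: Fraction(c, n) for k, c in floors.items()}


approx = nearest_type({"a": 0.37, "b": 0.63, "c": 0.0}, n=10)
```

Applied to a joint pmf keyed by pairs $(x, y)$, the marginals of the rounded type then differ from $P_X$ and $P_Y$ by at most $|\mathcal{Y}|/n$ and $|\mathcal{X}|/n$, which is the content of the marginal bounds in the proof.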
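From [?, Lemma 2.3], the type classes of the approximating types $\tilde{P}_X$ and $\tilde{P}_Y$ satisfy

$$\left|\frac{1}{n}\log|T^n_0(\tilde{P}_X)| - H(\tilde{P}_X)\right| \le \delta_{1n}, \qquad \left|\frac{1}{n}\log|T^n_0(\tilde{P}_Y)| - H(\tilde{P}_Y)\right| \le \delta_{2n} \quad \forall n, \qquad (8)$$

where $\delta_{1n} = (n+1)^{-|\mathcal{X}|}$ and $\delta_{2n} = (n+1)^{-|\mathcal{Y}|}$. Fact 3 establishes the continuity of entropy with respect to the probability distribution. Using Fact 3 along with (5), (6) and (7), we obtain

$$\left|\frac{1}{n}\log|T^n_0(\tilde{P}_X)| - H(P_X)\right| \le \delta_{1n}, \qquad \left|\frac{1}{n}\log|T^n_0(\tilde{P}_Y)| - H(P_Y)\right| \le \delta_{2n} \quad \forall n, \qquad (9)$$

where we have reused $\delta_{1n}, \delta_{2n}$ with some abuse of notation. This proves the first property.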
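
The size of a type class can be computed exactly as a multinomial coefficient, which makes it easy to compare against the standard type-class bounds $(n+1)^{-|\mathcal{X}|} 2^{nH(P)} \le |T^n_0(P)| \le 2^{nH(P)}$. The sketch below is a sanity check under that assumption (logarithms to base 2); the function names are hypothetical.

```python
import math


def type_class_size(counts):
    """Number of length-n sequences whose symbol counts equal `counts`."""
    size = math.factorial(sum(counts))
    for c in counts:
        size //= math.factorial(c)  # exact: partial quotients are integers
    return size


def type_entropy_bits(counts):
    """Entropy (bits) of the type with the given symbol counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)


def type_class_bounds(counts):
    """Return (lower, upper) = ((n+1)^{-|X|} 2^{nH}, 2^{nH})."""
    n, k = sum(counts), len(counts)
    upper = 2 ** (n * type_entropy_bits(counts))
    return upper / (n + 1) ** k, upper


size = type_class_size([3, 5, 2])
lower, upper = type_class_bounds([3, 5, 2])
```

The gap between `lower` and `upper` is polynomial in $n$, so it vanishes on the exponential scale used throughout the proof.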

We now note that $x^n \in V_X(\mathcal{A}_n) = T^n_0(\tilde{P}_X)$ and $y^n \in T^n_0(\tilde{P}_{Y|X}|x^n)$ implies a) $(x^n, y^n) \in T^n_0(\tilde{P}_{XY})$ and b) $y^n \in T^n_0(\tilde{P}_Y) = V_Y(\mathcal{A}_n)$ (Fact 1). This implies

$$\text{degree}(x^n) \ge |T^n_0(\tilde{P}_{Y|X}|x^n)|, \quad \forall x^n \in V_X(\mathcal{A}_n). \qquad (10)$$

From [?, Lemma 2.5], we know that

$$|T^n_0(\tilde{P}_{Y|X}|x^n)| \ge 2^{n(H(\tilde{P}_{Y|X}) - \delta_{3n})} \qquad (11)$$

where $\delta_{3n} = |\mathcal{X}||\mathcal{Y}| \frac{\log(n+1)}{n}$. In the above, $H(\tilde{P}_{Y|X})$ stands for $H(Y|X)$ computed under the joint distribution $\tilde{P}_{XY}$. Combining this with (10), we get a lower bound on the degree of each $x^n \in V_X(\mathcal{A}_n)$:

$$\text{degree}(x^n) \ge 2^{n(H(\tilde{P}_{Y|X}) - \delta_{3n})} \qquad (12)$$

From (5) and (6), one can deduce that $\forall x, y$

$$|\tilde{P}_{Y|X}(y|x) - P_{Y|X}(y|x)| \le \gamma_n$$

for some $\gamma_n \to 0$. Combining this with Fact 3, (12) can be written as

$$\text{degree}(x^n) \ge 2^{n(H(P_{Y|X}) - \delta_{3n})}. \qquad (13)$$

Further, (3) gives an upper bound on the degree of each vertex in $G_n$. Hence we have

$$\left|\frac{1}{n}\log\theta'_n(x^n) - H(Y|X)\right| \le \max(\delta_{3n}, \epsilon_n) \quad \forall x^n \in V_X(\mathcal{A}_n). \qquad (14)$$

Similarly, we can bound the degree of each vertex in $V_Y(\mathcal{A}_n)$ as

$$\left|\frac{1}{n}\log\theta'_n(y^n) - H(X|Y)\right| \le \max(\delta_{4n}, \epsilon_n) \quad \forall y^n \in V_Y(\mathcal{A}_n). \qquad (15)$$

Finally, we can set $\delta_n = \max(\delta_{1n}, \delta_{2n}, \delta_{3n}, \delta_{4n}, \epsilon_n)$ to complete the proof of the proposition.

■

IV. Sub-graphs contained in typicality graphs 
In this section, we study the subgraphs contained in a sequence of typicality graphs. 

A. Subgraphs of general degree 

Definition 4: A sequence of typicality graphs $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ is said to contain a sequence of subgraphs $\Gamma_n$ of rates $(R_X, R_Y, R'_X, R'_Y)$ if there exists a sequence $\delta_n \to 0$ such that, for each $n$:

1) The vertex sets of the subgraphs have sizes (denoted $\Delta^n_X$ and $\Delta^n_Y$) that satisfy

$$\left|\frac{1}{n}\log\Delta^n_X - R_X\right| \le \delta_n, \qquad \left|\frac{1}{n}\log\Delta^n_Y - R_Y\right| \le \delta_n, \quad \forall n.$$

2) The degree of each vertex $x^n$ in $V_X(\Gamma_n)$, denoted $\Delta'_n(x^n)$, satisfies

$$\left|\frac{1}{n}\log\Delta'_n(x^n) - R'_Y\right| \le \delta_n, \quad \forall x^n \in V_X(\Gamma_n), \ \forall n.$$

3) The degree of each vertex $y^n$ in $V_Y(\Gamma_n)$, denoted $\Delta'_n(y^n)$, satisfies

$$\left|\frac{1}{n}\log\Delta'_n(y^n) - R'_X\right| \le \delta_n, \quad \forall y^n \in V_Y(\Gamma_n), \ \forall n.$$

The following proposition gives a characterization of the rate-tuple of a sequence of subgraphs in the sequence of typicality graphs of $P_{XY}$.

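Proposition 2: Let $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ be a sequence of typicality graphs of $P_{XY}$. Define

$$\mathcal{R} = \{(R_X, R_Y, R'_X, R'_Y) : G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n) \text{ contains subgraphs of rates } (R_X, R_Y, R'_X, R'_Y)\}$$

Then

$$\mathcal{R} \supseteq \{(R_X, R_Y, R'_X, R'_Y) : R_X \le H(X|U), \ R_Y \le H(Y|U), \ R'_X \le H(X|YU), \ R'_Y \le H(Y|XU) \text{ for some } P_{U|XY}\} \qquad (16)$$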
Proof:

Definition of $\Gamma_n$. Consider any conditional distribution $P_{U|XY}$. This fixes the joint distribution $P_{XYU} = P_{XY} P_{U|XY}$. We construct $\Gamma_n$ as follows.

• For each $n$, approximate the values of $P_{UXY}$ to rational numbers with denominator $n$ to obtain the pmf $\tilde{P}_{UXY}$. Clearly $\tilde{P}_{UXY}$ is a valid joint type of length $n$ and the maximum approximation error is bounded by $\frac{1}{n}$. Marginalizing the joint pmf, we also have $\forall x, y$

$$|\tilde{P}_{XY}(x,y) - P_{XY}(x,y)| \le |\mathcal{U}| \cdot \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \lambda_n, \qquad (17)$$

$$|\tilde{P}_X(x) - P_X(x)| \le |\mathcal{Y}| \cdot |\mathcal{U}| \cdot \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \epsilon_{1n}, \qquad (18)$$

$$|\tilde{P}_Y(y) - P_Y(y)| \le |\mathcal{X}| \cdot |\mathcal{U}| \cdot \frac{1}{n} \ll \frac{1}{\sqrt{n}} < \epsilon_{2n}, \qquad (19)$$

where the last inequality in each equation follows from the delta convention. Further, $\forall u$

$$|\tilde{P}_U(u) - P_U(u)| \le |\mathcal{Y}| \cdot |\mathcal{X}| \cdot \frac{1}{n}. \qquad (20)$$

• Pick any length-$n$ sequence $u^n$ with type $\tilde{P}_U$, i.e., $u^n \in T^n_0(\tilde{P}_U)$. Consider a bipartite graph $\Gamma_n$ with $X$-vertices consisting of all $x^n \in T^n_0(\tilde{P}_{X|U}|u^n)$, and $Y$-vertices consisting of all $y^n \in T^n_0(\tilde{P}_{Y|U}|u^n)$. In other words, having fixed $u^n$, the $X$-vertex sets and $Y$-vertex sets consist of all length-$n$ sequences having conditional type $\tilde{P}_{X|U}$ and $\tilde{P}_{Y|U}$, respectively. Vertices $x^n \in V_X(\Gamma_n)$ and $y^n \in V_Y(\Gamma_n)$ are connected in $\Gamma_n$ iff $(x^n, y^n) \in T^n_0(\tilde{P}_{XY|U}|u^n)$, i.e., if they have the conditional joint type $\tilde{P}_{XY|U}$ given $u^n$.

Let us verify that $\Gamma_n$ is a subgraph of $G_n$. From Fact 1, if $u^n \in T^n_0(\tilde{P}_U)$ and $x^n \in T^n_0(\tilde{P}_{X|U}|u^n)$, then $(x^n, u^n) \in T^n_0(\tilde{P}_{XU})$. Consequently, $x^n \in T^n_0(\tilde{P}_X)$. Similarly, all $y^n \in T^n_0(\tilde{P}_{Y|U}|u^n)$ belong to $T^n_0(\tilde{P}_Y)$. On the same lines, if $u^n \in T^n_0(\tilde{P}_U)$ and $(x^n, y^n) \in T^n_0(\tilde{P}_{XY|U}|u^n)$, then $(x^n, y^n, u^n) \in T^n_0(\tilde{P}_{XYU})$. This implies $(x^n, y^n) \in T^n_0(\tilde{P}_{XY})$. Further, from (17), (18) and (19), we know

$$T^n_0(\tilde{P}_X) \subset T^n_{\epsilon_{1n}}(P_X) = V_X(G_n), \qquad T^n_0(\tilde{P}_Y) \subset T^n_{\epsilon_{2n}}(P_Y) = V_Y(G_n) \qquad \text{and} \qquad T^n_0(\tilde{P}_{XY}) \subset T^n_{\lambda_n}(P_{XY}).$$

Hence for all sufficiently large $n$, $\Gamma_n$ is a subgraph of the typicality graph $G_n$.
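Properties of $\Gamma_n$. From [?, Lemma 2.3], we have

$$\left|\frac{1}{n}\log|T^n_0(\tilde{P}_{X|U}|u^n)| - H(\tilde{P}_{X|U})\right| \le \delta_{1n}, \qquad \left|\frac{1}{n}\log|T^n_0(\tilde{P}_{Y|U}|u^n)| - H(\tilde{P}_{Y|U})\right| \le \delta_{2n} \quad \forall n, \qquad (21)$$

where $\delta_{1n} = (n+1)^{-|\mathcal{X}||\mathcal{U}|}$ and $\delta_{2n} = (n+1)^{-|\mathcal{Y}||\mathcal{U}|}$. Using (18), (19) with (20), we know that $\tilde{P}_{X|U}, \tilde{P}_{Y|U}$ are close to $P_{X|U}, P_{Y|U}$, respectively. Using Fact 3, we know that the entropies $H(\tilde{P}_{X|U}), H(\tilde{P}_{Y|U})$ must be close to $H(P_{X|U}), H(P_{Y|U})$, respectively. Thus we can write (21) as (reusing $\delta_{1n}, \delta_{2n}$)

$$\left|\frac{1}{n}\log|T^n_0(\tilde{P}_{X|U}|u^n)| - H(P_{X|U})\right| \le \delta_{1n}, \qquad \left|\frac{1}{n}\log|T^n_0(\tilde{P}_{Y|U}|u^n)| - H(P_{Y|U})\right| \le \delta_{2n} \quad \forall n. \qquad (22)$$

Thus, the vertex sets of $\Gamma_n$ have rates $R_X = H(X|U)$ and $R_Y = H(Y|U)$, as required.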

Using Fact 1, for any $x^n \in V_X(\Gamma_n)$, every $y^n \in T^n_0(\tilde{P}_{Y|XU}|x^n, u^n)$ will satisfy a) $(x^n, y^n) \in T^n_0(\tilde{P}_{XY|U}|u^n)$ and b) $y^n \in T^n_0(\tilde{P}_{Y|U}|u^n)$. Hence

$$\text{degree}(x^n) \ge |T^n_0(\tilde{P}_{Y|XU}|x^n, u^n)| \ge 2^{n(H(\tilde{P}_{Y|XU}) - \delta_{3n})} \qquad (23)$$

where $\delta_{3n} = |\mathcal{X}||\mathcal{Y}||\mathcal{U}| \frac{\log(n+1)}{n}$. We can also upper bound the degree of $x^n$ by noting that $x^n \in T^n_0(\tilde{P}_{X|U}|u^n)$ and $(x^n, y^n) \in T^n_0(\tilde{P}_{XY|U}|u^n)$ implies $y^n \in T^n_0(\tilde{P}_{Y|XU}|x^n, u^n)$. From [?, Lemma 2.5],

$$|T^n_0(\tilde{P}_{Y|XU}|x^n, u^n)| \le 2^{nH(\tilde{P}_{Y|XU})}.$$

Combining this with (23), we have

$$\left|\frac{1}{n}\log\Delta'_n(x^n) - H(\tilde{P}_{Y|XU})\right| \le \delta_{3n}, \quad \forall x^n \in V_X(\Gamma_n), \ \forall n. \qquad (24)$$

In a similar fashion, we can show that

$$\left|\frac{1}{n}\log\Delta'_n(y^n) - H(\tilde{P}_{X|YU})\right| \le \delta_{4n}, \quad \forall y^n \in V_Y(\Gamma_n), \ \forall n. \qquad (25)$$

Since the distributions $\tilde{P}_{Y|XU}$ and $\tilde{P}_{X|YU}$ are close to $P_{Y|XU}$ and $P_{X|YU}$, respectively, Fact 3 enables us to replace $H(\tilde{P}_{Y|XU}), H(\tilde{P}_{X|YU})$ with $H(P_{Y|XU}), H(P_{X|YU})$, respectively, in the two preceding equations.

Taking $\delta_n = \max(\delta_{1n}, \delta_{2n}, \delta_{3n}, \delta_{4n})$, we have shown the existence of a sequence of subgraphs $\Gamma_n$ with rates $(R_X, R_Y, R'_X, R'_Y) = (H(X|U), H(Y|U), H(X|YU), H(Y|XU))$. Since we can simply exclude edges from $\Gamma_n$ to obtain subgraphs with smaller rates, it is clear that all rate tuples characterized by

$$(R_X, R_Y, R'_X, R'_Y) : R_X \le H(X|U), \ R_Y \le H(Y|U), \ R'_X \le H(X|YU), \ R'_Y \le H(Y|XU)$$

are achievable for every conditional distribution $P_{U|XY}$. ■
B. Nearly complete subgraphs

A complete bipartite graph is one in which each vertex of the first set is connected with every vertex of the other set. We next consider a specific class of subgraphs, namely nearly complete subgraphs. For this class of subgraphs, we have a converse result that fully characterizes the set of nearly complete subgraphs present in any typicality graph.

Definition 5: A sequence of typicality graphs $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ is said to contain a sequence of nearly complete subgraphs $\Gamma_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ of rates $(R_X, R_Y)$ if there exists a sequence $\delta_n \to 0$ such that, for each $n$:

1) The sizes of the vertex sets of the subgraphs, denoted $\Delta^n_X$ and $\Delta^n_Y$, satisfy

$$\left|\frac{1}{n}\log\Delta^n_X - R_X\right| \le \delta_n, \qquad \left|\frac{1}{n}\log\Delta^n_Y - R_Y\right| \le \delta_n, \quad \forall n.$$

2) The degree of each vertex $x^n$ in the $X$-set, denoted $\Delta'_n(x^n)$, satisfies

$$\frac{1}{n}\log\Delta'_n(x^n) \ge R_Y - \delta_n, \quad \forall x^n \in V_X(\Gamma_n), \ \forall n.$$

3) The degree of each vertex $y^n$ in the $Y$-set, denoted $\Delta'_n(y^n)$, satisfies

$$\frac{1}{n}\log\Delta'_n(y^n) \ge R_X - \delta_n, \quad \forall y^n \in V_Y(\Gamma_n), \ \forall n.$$

Proposition 3: Let $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ be a sequence of typicality graphs for $P_{XY}$. Define

$$\mathcal{R} = \{(R_X, R_Y) : G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n) \text{ contains nearly complete subgraphs of rates } (R_X, R_Y)\}$$

Then

1) $$\mathcal{R} \supseteq \{(R_X, R_Y) : R_X \le H(X|U), \ R_Y \le H(Y|U) \text{ for some } P_{U|XY} \text{ s.t. } X - U - Y\,^{4}\} \qquad (26)$$

2) For all sequences of nearly complete subgraphs of $G_n$ such that the sequence $\delta_n$ (in Definition 5) converges to 0 faster than $1/\log n$ (more precisely, $\delta_n = o\left(\frac{1}{\log n}\right)$ or $\lim_{n\to\infty} \delta_n \log n = 0$), the rates of the subgraph $(R_X, R_Y)$ satisfy

$$R_X \le H(X|U), \quad R_Y \le H(Y|U) \quad \text{for some } P_{U|XY} \text{ s.t. } X - U - Y.$$

Proof: The first part of the proposition follows directly from Proposition 2 by choosing $P_{U|XY}$ such that $X - U - Y$ form a Markov chain. We now prove the converse under the stated assumption that the sequence $\delta_n$ satisfies $\lim_{n\to\infty} \delta_n \log n = 0$.

4 $X$, $U$, $Y$ form a Markov chain, in that order.
Suppose that a sequence of typicality graphs $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ contains nearly complete subgraphs $\Gamma_n$ of rates $R_X, R_Y$. The total number of edges in $\Gamma_n$ can be lower bounded as

$$|\text{Edges}(\Gamma_n)| \ge \Delta^n_X \cdot (\text{minimum degree of a vertex in } V_X(\Gamma_n)) \ge \Delta^n_X \cdot 2^{n(R_Y - \delta_n)} \ge \Delta^n_X \cdot \Delta^n_Y \cdot 2^{-2n\delta_n}. \qquad (27)$$

Each of these edges represents a pair $(x^n, y^n)$ that is jointly $\lambda_n$-typical with respect to the distribution $P_{XY}$. In other words, each of these pairs $(x^n, y^n)$ belongs to a joint type [?] that is 'close' to $P_{XY}$. Since the number of joint types of a pair of sequences of length $n$ is at most $(n+1)^{|\mathcal{X}||\mathcal{Y}|}$, the number of edges belonging to the dominant joint type, say $\bar{P}_{XY}$, satisfies

$$|\text{Edges}(\Gamma_n) \text{ having joint type } \bar{P}_{XY}| \ge \frac{\Delta^n_X \Delta^n_Y 2^{-2n\delta_n}}{(n+1)^{|\mathcal{X}||\mathcal{Y}|}}. \qquad (28)$$

Define a subgraph $\mathcal{A}_n$ of $\Gamma_n$ consisting only of the edges having joint type $\bar{P}_{XY}$. A word about the notation used in the sequel: we will use $i, j$ to index the vertices in $V_X(\Gamma_n), V_Y(\Gamma_n)$, respectively. Thus $i \in \{1, \ldots, \Delta^n_X\}$ and $j \in \{1, \ldots, \Delta^n_Y\}$. The actual sequences corresponding to these vertices will be denoted $x^n(i), y^n(j)$ etc. Using this notation,

$$\mathcal{A}_n = \{(i,j) : i \in V_X(\Gamma_n), \ j \in V_Y(\Gamma_n) \text{ s.t. } (x^n(i), y^n(j)) \text{ has joint type } \bar{P}_{XY}\} \qquad (29)$$

From (28),

$$|\mathcal{A}_n| \ge \frac{\Delta^n_X \Delta^n_Y 2^{-2n\delta_n}}{(n+1)^{|\mathcal{X}||\mathcal{Y}|}} \qquad (30)$$
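
The pigeonhole step behind the dominant joint type can be illustrated on a toy edge list. The sketch below is an assumption-laden illustration, not the authors' procedure: it groups edges by their joint type and picks the most frequent one, and it counts types with the stars-and-bars formula, which is at most $(n+1)^{|\mathcal{X}||\mathcal{Y}|}$. All function names are hypothetical.

```python
import math
from collections import Counter


def num_types(n, alphabet_size):
    """Exact number of types with denominator n on an alphabet of the given size."""
    return math.comb(n + alphabet_size - 1, alphabet_size - 1)


def dominant_joint_type(edges):
    """Return (joint type as a dict of pair counts, number of edges of that type).

    `edges` is a list of (x_seq, y_seq) pairs of equal-length tuples.
    """
    by_type = Counter(
        frozenset(Counter(zip(x, y)).items()) for x, y in edges
    )
    joint_type, count = by_type.most_common(1)[0]
    return dict(joint_type), count


edges = [
    ((0, 0, 1), (0, 1, 1)),
    ((0, 1, 0), (0, 1, 1)),
    ((1, 1, 1), (1, 1, 1)),
]
joint_type, count = dominant_joint_type(edges)
```

Since every edge has some joint type, the most frequent type accounts for at least a `1 / num_types(n, |X||Y|)` fraction of the edges, which is the polynomial loss appearing in the denominator of the bound.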

We will prove the converse result using a series of lemmas concerning $\mathcal{A}_n$. Some of the lemmas are similar to those required to prove [?, Theorem 1]. We only sketch the proofs of such lemmas, referring the reader to [?] for details.
Define random variables $X'^n, Y'^n$ with pmf

$$\Pr\left((X'^n, Y'^n) = (x^n(i), y^n(j))\right) = \frac{1}{|\mathcal{A}_n|}, \quad \text{if } (i,j) \in \mathcal{A}_n. \qquad (31)$$

Lemma 1: $I(X'^n; Y'^n) \le 2n\delta_n + |\mathcal{X}||\mathcal{Y}| \log(n+1)$.

Proof: Follow steps similar to the proof of [?, Lemma 1], using (30) to lower bound the size of $\mathcal{A}_n$.
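The next lemma is Ahlswede's version of the 'wringing' technique. Roughly speaking, if it is known that the mutual information between two random sequences is small, then the lemma gives an upper bound on the per-letter mutual information terms (conditioned on some values).

Lemma 2: [?] Let $A^n, B^n$ be RV's with values in $\mathcal{A}^n, \mathcal{B}^n$ resp. and assume that

$$I(A^n; B^n) \le \sigma.$$

Then, for any $0 < \delta < \sigma$ there exist $t_1, t_2, \ldots, t_k \in \{1, \ldots, n\}$, where $0 \le k \le \frac{2\sigma}{\delta}$, such that for some $a_{t_1}, b_{t_1}, a_{t_2}, b_{t_2}, \ldots, a_{t_k}, b_{t_k}$

$$I(A_t; B_t | A_{t_1} = a_{t_1}, B_{t_1} = b_{t_1}, \ldots, A_{t_k} = a_{t_k}, B_{t_k} = b_{t_k}) \le \delta \quad \text{for } t = 1, 2, \ldots, n \qquad (32)$$

and

$$\Pr(A_{t_1} = a_{t_1}, B_{t_1} = b_{t_1}, \ldots, A_{t_k} = a_{t_k}, B_{t_k} = b_{t_k}) \ge \left(\frac{\delta}{|\mathcal{A}||\mathcal{B}|(2\sigma - \delta)}\right)^k. \qquad (33)$$

■

In our case, we will apply Lemma 2 to random variables $X'^n$ and $Y'^n$. Lemma 1 indicates $\sigma = 2n\delta_n + |\mathcal{X}||\mathcal{Y}| \log(n+1)$, and $\delta$ shall be specified later. Hence we have that for some

$$k \le \frac{2\sigma}{\delta} = \frac{2(2n\delta_n + |\mathcal{X}||\mathcal{Y}| \log(n+1))}{\delta},$$

there exist $x_{t_1}, y_{t_1}, x_{t_2}, y_{t_2}, \ldots, x_{t_k}, y_{t_k}$ such that

$$I(X'_t; Y'_t | X'_{t_1} = x_{t_1}, Y'_{t_1} = y_{t_1}, \ldots, X'_{t_k} = x_{t_k}, Y'_{t_k} = y_{t_k}) \le \delta \quad \text{for } t = 1, 2, \ldots, n. \qquad (34)$$

We now define a subgraph of $\mathcal{A}_n$ consisting of all edges $(X'^n, Y'^n)$ that have $X'_{t_1} = x_{t_1}, Y'_{t_1} = y_{t_1}, \ldots, X'_{t_k} = x_{t_k}, Y'_{t_k} = y_{t_k}$. The subgraph, denoted $\bar{\mathcal{A}}_n$, is given by:$^{5}$

$$\bar{\mathcal{A}}_n \triangleq \{(i,j) \in \mathcal{A}_n : x_{t_1}(i) = x_{t_1}, y_{t_1}(j) = y_{t_1}, \ldots, x_{t_k}(i) = x_{t_k}, y_{t_k}(j) = y_{t_k}\} \qquad (35)$$

On the same lines as [?, Lemma 3], we have

$$|\bar{\mathcal{A}}_n| \ge \left(\frac{\delta}{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}\right)^k |\mathcal{A}_n|. \qquad (36)$$

5 The hierarchy of subgraphs is $G_n \supseteq \Gamma_n \supseteq \mathcal{A}_n \supseteq \bar{\mathcal{A}}_n$.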
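Define random variables $\bar{X}^n, \bar{Y}^n$ on $\mathcal{X}^n$ resp. $\mathcal{Y}^n$ by

$$\Pr\left((\bar{X}^n, \bar{Y}^n) = (x^n(i), y^n(j))\right) = \frac{1}{|\bar{\mathcal{A}}_n|} \quad \text{if } (i,j) \in \bar{\mathcal{A}}_n. \qquad (37)$$

If we denote $\bar{X}^n = (\bar{X}_1, \ldots, \bar{X}_n)$, $\bar{Y}^n = (\bar{Y}_1, \ldots, \bar{Y}_n)$, the Fano-distribution of the graph $\bar{\mathcal{A}}_n$ induces a distribution on the random variables $\bar{X}_t, \bar{Y}_t$, $t = 1, \ldots, n$. One can show that

$$\Pr(\bar{X}_t = x, \bar{Y}_t = y) = \Pr(X'_t = x, Y'_t = y | X'_{t_1} = x_{t_1}, Y'_{t_1} = y_{t_1}, \ldots, X'_{t_k} = x_{t_k}, Y'_{t_k} = y_{t_k}). \qquad (38)$$

Using (38) in Lemma 2, we get the bound $I(\bar{X}_t; \bar{Y}_t) \le \delta$. Applying Pinsker's inequality for I-divergences [?], we have

$$|\Pr(\bar{X}_t = x, \bar{Y}_t = y) - \Pr(\bar{X}_t = x)\Pr(\bar{Y}_t = y)| \le 2\delta^{1/2}, \quad 1 \le t \le n. \qquad (39)$$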
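
The step from a small mutual information to near-independence can be checked numerically. The sketch below assumes logarithms to base 2, in which case Pinsker's inequality reads $\sum_{x,y} |P_{XY} - P_X P_Y| \le \sqrt{2 \ln 2 \cdot I(X;Y)} \le 2\, I(X;Y)^{1/2}$; each individual term is then also at most $2\, I(X;Y)^{1/2}$. The function name and the example distribution are hypothetical.

```python
import math


def mi_and_l1(joint):
    """Return (I(X;Y) in bits, L1 distance between the joint and the product of marginals).

    `joint` maps (x, y) to a probability.
    """
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    mi = sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items() if p > 0
    )
    l1 = sum(
        abs(joint.get((x, y), 0.0) - p_x[x] * p_y[y])
        for x in p_x for y in p_y
    )
    return mi, l1


joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
mi, l1 = mi_and_l1(joint)
pinsker = math.sqrt(2 * math.log(2) * mi)
```

In the proof, the role of `mi` is played by the per-letter mutual information, which the wringing lemma bounds by $\delta$.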

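Also define

$$C(i) = \{(i,j) : (i,j) \in \bar{\mathcal{A}}_n, \ 1 \le j \le \Delta^n_Y\}, \qquad (40a)$$

$$B(j) = \{(i,j) : (i,j) \in \bar{\mathcal{A}}_n, \ 1 \le i \le \Delta^n_X\}. \qquad (40b)$$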

We are now ready to present the final lemma required to complete the proof of the converse.

Lemma 3:

$$R_X \le \frac{1}{n}\sum_{t=1}^{n} H(\bar{X}_t|\bar{Y}_t) + \delta_{1n}$$

$$R_Y \le \frac{1}{n}\sum_{t=1}^{n} H(\bar{Y}_t|\bar{X}_t) + \delta_{2n}$$

$$R_X + R_Y \le \frac{1}{n}\sum_{t=1}^{n} H(\bar{X}_t \bar{Y}_t) + \delta_{3n}$$

for some $\delta_{1n}, \delta_{2n}, \delta_{3n} \to 0$, where the distributions of the RV's are determined by the Fano-distribution on the codewords $\{(x^n(i), y^n(j)) : (i,j) \in \bar{\mathcal{A}}_n\}$.

Proof: We use a strong converse result for non-stationary discrete memoryless channels, found in [?]. Consider a DMC with input $A_t$ and output $B_t$ ($t = 1, \ldots, n$), with average error probability $\lambda$ ($0 \le \lambda < 1$). The result states that the size of the message set $M$ is upper-bounded as

$$\log M \le \sum_{t=1}^{n} I(A_t; B_t) + \frac{3}{1-\lambda}|\mathcal{A}| n^{1/2}, \qquad (41)$$
where the distributions of the RV's are determined by the Fano-distribution on the codewords.

We apply the above result to three noiseless DMCs ($B_t = A_t$, $\lambda = 0$) as follows. Fix $\bar{Y}^n = y^n(j)$ for some $(i,j) \in \bar{\mathcal{A}}_n$ and let the input be $\bar{X}_t$, $t = 1, \ldots, n$. Then, from (41) we have

$$\log|B(j)| \le \sum_{t=1}^{n} H(\bar{X}_t|\bar{Y}_t = y_t(j)) + 3|\mathcal{X}| n^{1/2}. \qquad (42)$$

Similarly,

$$\log|C(i)| \le \sum_{t=1}^{n} H(\bar{Y}_t|\bar{X}_t = x_t(i)) + 3|\mathcal{Y}| n^{1/2}, \qquad (43)$$

$$\log|\bar{\mathcal{A}}_n| \le \sum_{t=1}^{n} H(\bar{X}_t \bar{Y}_t) + 3|\mathcal{X}||\mathcal{Y}| n^{1/2}. \qquad (44)$$

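Noting that $\Pr(\bar{Y}_t = y) = |\bar{\mathcal{A}}_n|^{-1} \sum_{(i,j) \in \bar{\mathcal{A}}_n} \mathbb{1}\{y_t(j) = y\}$, we can sum both sides of (42) over all $(i,j) \in \bar{\mathcal{A}}_n$ to obtain

$$|\bar{\mathcal{A}}_n|^{-1} \sum_{(i,j) \in \bar{\mathcal{A}}_n} \log|B(j)| \le \sum_{t=1}^{n} H(\bar{X}_t|\bar{Y}_t) + 3|\mathcal{X}| n^{1/2}. \qquad (45)$$

Define

$$B^* \triangleq \frac{1}{n} \cdot \frac{\Delta^n_X 2^{-2n\delta_n}}{(n+1)^{|\mathcal{X}||\mathcal{Y}|}} \left(\frac{\delta}{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}\right)^k. \qquad (46)$$

Then,

$$|\bar{\mathcal{A}}_n|^{-1} \sum_{(i,j) \in \bar{\mathcal{A}}_n} \log|B(j)| = |\bar{\mathcal{A}}_n|^{-1} \sum_{j} |B(j)| \log|B(j)| \ge |\bar{\mathcal{A}}_n|^{-1} \sum_{j : |B(j)| \ge B^*} |B(j)| \log|B(j)| \ge |\bar{\mathcal{A}}_n|^{-1} \log(B^*) \left(|\bar{\mathcal{A}}_n| - \Delta^n_Y B^*\right). \qquad (47)$$

Combining (36), (30) and the definition of $B^*$, we also have

$$\Delta^n_Y B^* \le \frac{1}{n}|\bar{\mathcal{A}}_n|. \qquad (48)$$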
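Using this in (47), we have

$$|\bar{\mathcal{A}}_n|^{-1} \sum_{(i,j) \in \bar{\mathcal{A}}_n} \log|B(j)| \ge |\bar{\mathcal{A}}_n|^{-1} \log(B^*) \left(|\bar{\mathcal{A}}_n| - \frac{1}{n}|\bar{\mathcal{A}}_n|\right) = \left(1 - \frac{1}{n}\right) \log\left(\frac{\Delta^n_X 2^{-2n\delta_n}}{n (n+1)^{|\mathcal{X}||\mathcal{Y}|}} \left(\frac{\delta}{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}\right)^k\right). \qquad (49)$$

Using (45) in the above we have

$$\log \Delta^n_X \le \frac{n}{n-1}\left(\sum_{t=1}^{n} H(\bar{X}_t|\bar{Y}_t) + 3|\mathcal{X}| n^{1/2}\right) + 2n\delta_n + \log n + |\mathcal{X}||\mathcal{Y}| \log(n+1) + k \log\left(\frac{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}{\delta}\right). \qquad (50)$$

Analogously,

$$\log \Delta^n_Y \le \frac{n}{n-1}\left(\sum_{t=1}^{n} H(\bar{Y}_t|\bar{X}_t) + 3|\mathcal{Y}| n^{1/2}\right) + 2n\delta_n + \log n + |\mathcal{X}||\mathcal{Y}| \log(n+1) + k \log\left(\frac{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}{\delta}\right). \qquad (51)$$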

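Next, we find an upper bound for $\log \Delta^n_X \Delta^n_Y$. From (36), we get

$$\log|\bar{\mathcal{A}}_n| \ge \log|\mathcal{A}_n| + k \log\left(\frac{\delta}{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}\right)$$
$$\ge \log|\mathcal{A}_n| + k \log\left(\frac{\delta}{|\mathcal{X}||\mathcal{Y}| 2\sigma}\right)$$
$$= \log|\mathcal{A}_n| - k \log\left(\frac{2\sigma}{\delta}\right) - k \log(|\mathcal{X}||\mathcal{Y}|)$$
$$\overset{(a)}{\ge} \log(\Delta^n_X \Delta^n_Y) - |\mathcal{X}||\mathcal{Y}| \log(n+1) - 2n\delta_n - k \log\left(\frac{2\sigma |\mathcal{X}||\mathcal{Y}|}{\delta}\right), \qquad (52)$$

where (a) is obtained by using (30). Using (44), the above inequality becomes

$$\log(\Delta^n_X \Delta^n_Y) \le \sum_{t=1}^{n} H(\bar{X}_t \bar{Y}_t) + 3|\mathcal{X}||\mathcal{Y}| n^{1/2} + |\mathcal{X}||\mathcal{Y}| \log(n+1) + 2n\delta_n + k \log\left(\frac{2\sigma}{\delta}\right) + k \log(|\mathcal{X}||\mathcal{Y}|). \qquad (53)$$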
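Using the lower bounds on the sizes of $\Delta^n_X, \Delta^n_Y$ from Definition 5, we can rewrite (50), (51) and (53) as

$$R_X - \delta_n \le \frac{1}{n-1}\sum_{t=1}^{n} H(\bar{X}_t|\bar{Y}_t) + \frac{3|\mathcal{X}| n^{1/2}}{n-1} + 2\delta_n + \frac{\log n + |\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} + \frac{k}{n} \log\left(\frac{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}{\delta}\right) \qquad (54)$$

$$R_Y - \delta_n \le \frac{1}{n-1}\sum_{t=1}^{n} H(\bar{Y}_t|\bar{X}_t) + \frac{3|\mathcal{Y}| n^{1/2}}{n-1} + 2\delta_n + \frac{\log n + |\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} + \frac{k}{n} \log\left(\frac{|\mathcal{X}||\mathcal{Y}|(2\sigma - \delta)}{\delta}\right) \qquad (55)$$

$$R_X + R_Y - 2\delta_n \le \frac{1}{n}\sum_{t=1}^{n} H(\bar{X}_t \bar{Y}_t) + \frac{3|\mathcal{X}||\mathcal{Y}|}{\sqrt{n}} + \frac{|\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} + 2\delta_n + \frac{k}{n} \log\left(\frac{2\sigma |\mathcal{X}||\mathcal{Y}|}{\delta}\right) \qquad (56)$$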

For our proof we would like all the terms on the right hand side of the above equations (except the entropies) to converge to 0 as $n \to \infty$. This will happen if

$$\frac{k}{n} \log\left(\frac{\sigma}{\delta}\right) \to 0.$$

Recall from Lemma 1 that $\sigma = 2n\delta_n + |\mathcal{X}||\mathcal{Y}| \log(n+1)$ and $k \le \frac{2\sigma}{\delta}$. Hence we need to choose $\delta$ such that

$$\frac{\sigma}{n\delta} \log\left(\frac{\sigma}{\delta}\right) \sim \frac{\delta_n}{\delta}\left(\log(n\delta_n + \log n) - \log \delta\right) \to 0. \qquad (57)$$

From our assumption in the beginning, we have $\delta_n \log n \to 0$. Set

$$\delta = (\delta_n \log n)^{1/2} \qquad (58)$$

We see that asymptotically, (57) becomes

$$\frac{\delta_n^{1/2}}{(\log n)^{1/2}} \left[\log(n\delta_n + \log n) - \log(\delta_n^{1/2}) - \log\left((\log n)^{1/2}\right)\right] \qquad (59)$$

We separately consider each of the terms in the equation above.

1) If $\log(n\delta_n + \log n) \sim \log(n\delta_n)$ for large $n$, then

$$\frac{\delta_n^{1/2}}{(\log n)^{1/2}} \log(n\delta_n + \log n) \sim \frac{\delta_n^{1/2}}{(\log n)^{1/2}} \log(n\delta_n) = \frac{\delta_n^{1/2}}{(\log n)^{1/2}} [\log n + \log \delta_n] = (\delta_n \log n)^{1/2} + \frac{\delta_n^{1/2} \log \delta_n}{(\log n)^{1/2}} \to 0, \quad \text{since } \delta_n \log n \to 0. \qquad (60)$$

If $\log(n\delta_n + \log n) \sim \log(\log n)$ for large $n$, then

$$\frac{\delta_n^{1/2}}{(\log n)^{1/2}} \log(n\delta_n + \log n) \sim \frac{\delta_n^{1/2}}{(\log n)^{1/2}} \log(\log n) \to 0. \qquad (61)$$

2) $\frac{\delta_n^{1/2}}{(\log n)^{1/2}} \log(\delta_n^{1/2}) \to 0$ because $x \log x \to 0$ when $x \to 0$.

Hence the term in (59) converges to 0 as $n \to \infty$, completing the proof of the lemma. ■
We can rewrite Lemma 3 using new variables $\bar{X}, \bar{Y}, Q$, where $Q = t \in \{1, 2, \ldots, n\}$ with probability $\frac{1}{n}$ and $P_{\bar{X}\bar{Y}|Q=t} = P_{\bar{X}_t \bar{Y}_t}$. So we now have (for all sufficiently large $n$),

$$R_X \le H(\bar{X}|\bar{Y}, Q) + \delta_{1n} \qquad (62)$$
$$R_Y \le H(\bar{Y}|\bar{X}, Q) + \delta_{2n} \qquad (63)$$
$$R_X + R_Y \le H(\bar{X}, \bar{Y}|Q) + \delta_{3n}, \qquad (64)$$

for some $\delta_{1n}, \delta_{2n}, \delta_{3n} \to 0$.

Finally using (39), we also have

$$|\Pr(\bar{X} = x, \bar{Y} = y|Q = t) - \Pr(\bar{X} = x|Q = t)\Pr(\bar{Y} = y|Q = t)|$$
$$= |\Pr(\bar{X}_t = x, \bar{Y}_t = y) - \Pr(\bar{X}_t = x)\Pr(\bar{Y}_t = y)| \qquad (65)$$
$$\le 2\delta^{1/2} = 2(\delta_n \log n)^{1/4} \to 0 \quad \text{as } n \to \infty.$$

In other words, for all $t$, $\bar{X}_t, \bar{Y}_t$ are almost independent for large $n$. Consequently, using the continuity of mutual information with respect to the joint distribution, Lemma 3 holds for any joint distribution $P_Q P_{\bar{X}|Q} P_{\bar{Y}|Q}$ such that the marginal on $(\bar{X}, \bar{Y})$ is $\bar{P}_{XY}$. Recall that $\bar{P}_{XY}$ is the dominant joint type that is $\lambda_n$-close to $P_{XY}$. Using suitable continuity arguments, we can now argue that Lemma 3 holds for any joint distribution $P_Q P_{X|Q} P_{Y|Q}$ such that the marginal on $(X, Y)$ is $P_{XY}$, completing the proof of the converse.



C. Nearly Empty Subgraphs 

So far, we have discussed properties of subgraphs of the typicality graph $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ such as the containment of nearly complete subgraphs and subgraphs of general degree. Now, we turn our attention to the presence of nearly empty subgraphs in the typicality graph. Our approach to this problem differs from the one taken in Sections IV-A and IV-B. While in those sections we characterized the subgraphs based on the degrees of their vertices, in this section we characterize nearly empty subgraphs by the total number of edges present in such graphs. To this end, we analyze the probability that a randomly chosen subgraph of the typicality graph has far fewer edges than expected. In particular, we focus attention on the case when the random subgraph has no edges.

Consider a pair $(X, Y)$ of discrete memoryless stationary correlated sources with finite alphabets $\mathcal{X}$ and $\mathcal{Y}$ respectively. Suppose we sample $2^{nR_1}$ sequences from the typical set $T^n_{\epsilon_{1n}}(X)$ of $X$ independently with replacement and similarly sample $2^{nR_2}$ sequences from the typical set $T^n_{\epsilon_{2n}}(Y)$ of $Y$. The underlying typicality graph $G_n(\epsilon_{1n}, \epsilon_{2n}, \lambda_n)$ induces a bipartite graph on these $2^{nR_1} + 2^{nR_2}$ sequences. We provide a characterization of the probability that this graph is sparser than expected. This characterization is obtained through the use of a version of Suen's inequalities [?] and the Lovász local lemma [?] listed below.

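Lemma 4: [?] Let $I_i \in Be(p_i)$, $i \in \mathcal{I}$ be a family of Bernoulli random variables. Their dependency graph $L$ is formed in the following manner. Denote the random variable $I_i$ by a vertex $i$ and join vertices $i$ and $j$ by an edge if the corresponding random variables are dependent. Let $X = \sum_i I_i$ and $\Gamma = E(X) = \sum_i p_i$. Moreover, write $i \sim j$ if $(i,j)$ is an edge in the dependency graph $L$ and let $\Phi = \frac{1}{2}\sum_i \sum_{j \sim i} E(I_i I_j)$ and $\theta = \max_i \sum_{j \sim i} p_j$. Then, Suen's inequalities state that for any $0 \le a \le 1$,

$$P(X \le a\Gamma) \le \exp\left\{-\min\left((1-a)^2 \frac{\Gamma^2}{8\Phi + 2\Gamma}, \ (1-a)\frac{\Gamma}{6\theta}\right)\right\} \qquad (66)$$

Putting $a = 0$, this can be further tightened to

$$P(X = 0) \le \exp\left\{-\min\left(\frac{\Gamma^2}{8\Phi}, \ \frac{\Gamma}{2}, \ \frac{\Gamma}{6\theta}\right)\right\} \qquad (67)$$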

Lemma 5: [?] Let L be the dependency graph for events ei,...,e n in a probability 
space and let E(L) be the edge set of L. Suppose there exists Xi G [0, 1], 1 < i < n such 
that 

P(si)<Xi 11 {1-xj). (68) 

(i,j)eE(L) 



n 



Then, we have 

P(n7 =1 £-)>n(i-^)- (69) 

1=1 

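Another version of the local lemma is as given below. Let $\varphi(x)$, $0 \le x \le e^{-1}$, be the 
smallest root of the equation $\varphi(x) = e^{x\varphi(x)}$. With definitions of $\Gamma$ and $\theta$ as in Lemma 4 
and defining $\tau = \max_i P(\mathcal{E}_i)$, we have 

$$P\left( \bigcap_{i=1}^{n} \bar{\mathcal{E}}_i \right) \ge \exp\{ -\Gamma \varphi(\theta + \tau) \} \tag{70}$$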
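The function $\varphi$ in the second version of the local lemma has no elementary closed form, but it can be approximated by fixed-point iteration. The sketch below is illustrative only: the names `phi` and `local_lemma_lower_bound` are hypothetical, and the iteration relies on the observation that, started from 1, the map $v \mapsto e^{xv}$ increases monotonically towards the smallest root (convergence becomes slow as $x$ approaches $e^{-1}$).

```python
import math


def phi(x, tol=1e-12, max_iter=100_000):
    """Approximate the smallest root of phi = exp(x * phi) for 0 <= x <= 1/e."""
    if not 0.0 <= x <= math.exp(-1.0):
        raise ValueError("x must lie in [0, 1/e]")
    value = 1.0  # every root is at least 1, so start at or below the smallest root
    for _ in range(max_iter):
        next_value = math.exp(x * value)
        if abs(next_value - value) < tol:
            return next_value
        value = next_value
    return value  # best estimate if the tolerance was not reached


def local_lemma_lower_bound(gamma, small_theta, tau):
    """Evaluate exp(-Gamma * phi(theta + tau)), the right-hand side of (70)."""
    return math.exp(-gamma * phi(small_theta + tau))
```

When $\theta + \tau$ is small, `phi` is close to 1 and the bound is close to $e^{-\Gamma}$, which is the behaviour exploited when $\theta + \tau \to 0$.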
With these preliminaries, we are ready to state the main result of this section. 

Proposition 4: Suppose $X$ and $Y$ are correlated finite alphabet memoryless random 
variables with joint distribution $p(x, y)$. Let $\epsilon_{1n}, \epsilon_{2n}, \lambda_n$ satisfy the 'delta convention' and 
$R_1, R_2$ be any positive real numbers such that $R_1 + R_2 > I(X; Y)$. Let $C_X$ be a collection 
of $2^{nR_1}$ sequences picked independently and with replacement from $T^n_{\epsilon_{1n}}(X)$ and let $C_Y$ 
be defined similarly. Let $U$ be the cardinality of the set 

$$\mathcal{U} \triangleq \{ (x^n, y^n) \in C_X \times C_Y : (x^n, y^n) \in T^n_{\lambda_n}(X, Y) \} \tag{71}$$

Assume, without loss of generality, that $R_1 \ge R_2$. Then, for any $\gamma \ge 0$, we have 



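$$\lim_{n \to \infty} \frac{1}{n} \log\log \frac{1}{P\left( \frac{E(U) - U}{E(U)} \ge e^{-n\gamma} \right)} \ge \begin{cases} R_1 + R_2 - I(X; Y) - \gamma & \text{if } R_1 < I(X; Y) \\ R_2 - \gamma & \text{if } R_1 \ge I(X; Y) \end{cases} \tag{72}$$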

Setting $\gamma = 0$ in the above equation gives us 

$$\lim_{n \to \infty} \frac{1}{n} \log\log \frac{1}{P(U = 0)} \ge \min\left( R_2,\ R_1 + R_2 - I(X; Y) \right) \tag{73}$$

This inequality holds with equality when $R_2 \le R_1 < I(X; Y)$. 
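The regime structure of the proposition can be made concrete with a small helper. This is only a sketch: the function name `sparse_subgraph_exponent` is hypothetical, and it assumes the second case of the bound reads $R_2 - \gamma$, which is the form consistent with the $\gamma = 0$ expression $\min(R_2, R_1 + R_2 - I(X;Y))$.

```python
def sparse_subgraph_exponent(r1, r2, mutual_info, gamma=0.0):
    """Lower bound on the doubly exponential decay rate, per the two regimes.

    r1, r2      -- sampling rates R_1 and R_2 (swapped if needed so R_1 >= R_2)
    mutual_info -- I(X; Y), in the same units as the rates
    gamma       -- the parameter gamma >= 0 of the proposition
    """
    if r1 < r2:
        r1, r2 = r2, r1  # the proposition assumes R_1 >= R_2 without loss of generality
    if r1 + r2 <= mutual_info:
        raise ValueError("the proposition requires R_1 + R_2 > I(X; Y)")
    if r1 < mutual_info:
        return r1 + r2 - mutual_info - gamma
    return r2 - gamma


# With I(X; Y) = 1.0: below the threshold the sum rate matters,
# above it only the smaller rate R_2 does.
low_rate_exponent = sparse_subgraph_exponent(0.8, 0.5, 1.0)
high_rate_exponent = sparse_subgraph_exponent(1.2, 0.5, 1.0)
```

At $\gamma = 0$ both branches agree with $\min(R_2, R_1 + R_2 - I(X;Y))$, since $R_1 < I(X;Y)$ is exactly the condition under which $R_1 + R_2 - I(X;Y) < R_2$.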

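Proof: Let $X^n(i)$ and $Y^n(j)$ denote the $i$th and $j$th codewords in the random codebooks 
$C_X$ and $C_Y$ respectively. For $1 \le i \le 2^{nR_1}$ and $1 \le j \le 2^{nR_2}$, define the indicator 
random variables 

$$U_{ij} = \begin{cases} 1 & \text{if } (X^n(i), Y^n(j)) \in T^n_{\lambda_n}(X, Y) \\ 0 & \text{else} \end{cases} \tag{74}$$

The cardinality of the set $\mathcal{U}$ is then 

$$U = \sum_{i=1}^{2^{nR_1}} \sum_{j=1}^{2^{nR_2}} U_{ij} \tag{75}$$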

We derive upper bounds on the probability of the lower tail of $U$ using Suen's inequality. 
To do this, we first set up the dependency graph of the indicator random variables $U_{ij}$. 
The vertex set of the graph is indexed by the ordered pairs $(i, j)$, $1 \le i \le 2^{nR_1}$, $1 \le j \le 2^{nR_2}$. 
From the nature of the random experiment, it is clear that the indicator random 
variables $U_{ij}$ and $U_{i'j'}$ are independent if and only if $i \ne i'$ and $j \ne j'$. Thus, each 
vertex $(i, j)$ is connected to exactly $2^{nR_1} + 2^{nR_2} - 2$ vertices of the form $(i, j')$, $j' \ne j$, or 
$(i', j)$, $i' \ne i$. If vertices $(i, j)$ and $(k, l)$ are connected, we denote it by $(i, j) \sim (k, l)$. 
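The degree count in the dependency graph can be checked by brute force on a small grid. The sketch below is illustrative: `n1` and `n2` stand in for $2^{nR_1}$ and $2^{nR_2}$, and the function name `dependency_neighbors` is hypothetical.

```python
from itertools import product


def dependency_neighbors(i, j, n1, n2):
    """Vertices (k, l) joined to (i, j): same row or same column, excluding (i, j)."""
    return [
        (k, l)
        for k, l in product(range(n1), range(n2))
        if (k, l) != (i, j) and (k == i or l == j)
    ]


# Every vertex of a 5 x 3 grid should have 5 + 3 - 2 = 6 neighbours.
n1, n2 = 5, 3
degrees = {len(dependency_neighbors(i, j, n1, n2))
           for i, j in product(range(n1), range(n2))}
```

Each vertex shares its first index with `n2 - 1` other vertices and its second index with `n1 - 1` others, which gives the degree `n1 + n2 - 2` used in the bounds on $\Theta$ and $\theta$.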
In order to estimate $\Gamma$, $\Theta$ and $\theta$ as defined in Lemma 4, define the following quantities. 
Let $\alpha_{ij} = P(U_{ij} = 1)$ and $\beta_{(ij)(kl)} = E(U_{ij} U_{kl})$ where $(i, j) \sim (k, l)$. Using Facts 1 and 2, 
uniform bounds can be derived for these quantities as 

$$\alpha \triangleq 2^{-n(I(X;Y) + \epsilon_{3n})} \le \alpha_{ij} \le 2^{-n(I(X;Y) - \epsilon_{3n})} \triangleq \alpha' \tag{76}$$

where $\epsilon_{3n}$ is a continuous positive function of $\epsilon_{1n}$, $\epsilon_{2n}$ and $\lambda_n$ that goes to 0 as $n \to \infty$. 
Similarly, a uniform bound on $\beta_{(ij)(kl)}$ can be derived as 

$$2^{-2n(I(X;Y) + 2\epsilon_{4n})} \le \beta_{(ij)(kl)} \le 2^{-2n(I(X;Y) - 2\epsilon_{4n})} \triangleq \beta \tag{77}$$

where $\epsilon_{4n}$ is a continuous positive function of $\epsilon_{1n}$, $\epsilon_{2n}$ and $\lambda_n$ that goes to 0 as $n \to \infty$. 
The quantities involved in Suen's inequality can now be estimated. 

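$$\Gamma \triangleq E(U) \ge 2^{n(R_1 + R_2)} \alpha \tag{78}$$

$$\Theta \triangleq \frac{1}{2} \sum_{(i,j)} \sum_{(k,l) \sim (i,j)} E(U_{ij} U_{kl}) \le \frac{1}{2}\, 2^{n(R_1 + R_2)} \left( 2^{nR_1} + 2^{nR_2} - 2 \right) \beta \tag{79}$$

$$\theta \triangleq \max_{(i,j)} \sum_{(k,l) \sim (i,j)} E(U_{kl}) \le \left( 2^{nR_1} + 2^{nR_2} - 2 \right) \alpha' \tag{80}$$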
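As a worked sketch of how these estimates combine for the case $U = 0$, the three terms of the $a = 0$ Suen bound can be bounded as follows. This assumes that bound has the form $\exp\{-\min(\Gamma^2/8\Theta,\ \Gamma/2,\ \Gamma/6\theta)\}$ and uses $R_1 \ge R_2$ to replace $2^{nR_1} + 2^{nR_2} - 2$ by the cruder $2 \cdot 2^{nR_1}$; the constants are not optimized.

```latex
% Assumes R_1 \ge R_2, so that 2^{nR_1} + 2^{nR_2} - 2 \le 2 \cdot 2^{nR_1}.
\begin{align*}
\Gamma &\ge 2^{n(R_1 + R_2 - I(X;Y) - \epsilon_{3n})}, \\
\Theta &\le \tfrac{1}{2}\, 2^{n(R_1 + R_2)} \cdot 2 \cdot 2^{nR_1} \cdot 2^{-2n(I(X;Y) - 2\epsilon_{4n})}
        = 2^{n(2R_1 + R_2 - 2I(X;Y) + 4\epsilon_{4n})}, \\
\theta &\le 2 \cdot 2^{n(R_1 - I(X;Y) + \epsilon_{3n})}, \\[4pt]
\frac{\Gamma^2}{8\Theta} &\ge \tfrac{1}{8}\, 2^{n(R_2 - 2\epsilon_{3n} - 4\epsilon_{4n})}, \qquad
\frac{\Gamma}{6\theta} \ge \tfrac{1}{12}\, 2^{n(R_2 - 2\epsilon_{3n})}, \qquad
\frac{\Gamma}{2} \ge \tfrac{1}{2}\, 2^{n(R_1 + R_2 - I(X;Y) - \epsilon_{3n})}.
\end{align*}
```

Since the minimum is non-decreasing in each of its arguments, $-\log P(U = 0)$ is at least the smallest of the three right-hand sides. The first two grow like $2^{nR_2}$ and the third like $2^{n(R_1 + R_2 - I(X;Y))}$, so after normalizing by $n$ and letting $\epsilon_{3n}, \epsilon_{4n} \to 0$ the exponent is $\min(R_2,\ R_1 + R_2 - I(X;Y))$.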

Substituting these bounds into equations (66) and (67) proves the claims made in 
equations (72) and (73) of Proposition 4. 

A lower bound to the probability of the induced random subgraph being empty can 
be derived by employing the Lovász local lemma on the $2^{n(R_1 + R_2)}$ events $\{U_{ij} = 1\}$, 
$1 \le i \le 2^{nR_1}$, $1 \le j \le 2^{nR_2}$. Symmetry considerations imply that all $x_i$ can be set identically 
to $x$ in Lemma 5. Then the local lemma states that if there exists $x \in [0, 1]$ such that 
$\alpha \le P(U_{ij} = 1) \le x(1 - x)^{2^{nR_1} + 2^{nR_2} - 2}$, then $P(U = 0) \ge (1 - x)^{2^{n(R_1 + R_2)}}$. It is easy to 
verify that for such an $x$ to exist, we need $R_2 \le R_1 < I(X; Y)$ and if so, $x = 2^{-nR_1}$ 
satisfies the condition. Therefore, we have 

$$P(U = 0) \ge \exp\left( -\left( 2^{nR_2} + 1 \right) \right), \qquad R_2 \le R_1 < I(X; Y) \tag{81}$$
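The choice $x = 2^{-nR_1}$ can be sanity-checked numerically for specific parameter values. The following sketch works in the log domain to avoid underflow; the function name `local_lemma_check` and the parameter `eps` (standing in for $\epsilon_{3n}$) are assumptions made for illustration, and a check at finite $n$ is of course not a proof of the asymptotic claim.

```python
import math


def local_lemma_check(n, r1, r2, mutual_info, eps=0.0):
    """Check the symmetric local lemma condition with x = 2^(-n * r1).

    Returns (condition_holds, log_lower_bound, log_claimed_bound), where
    log_lower_bound is log (1 - x)^(N1 * N2) and log_claimed_bound is
    -(N2 + 1), the exponent appearing in (81). Natural logarithms throughout.
    """
    n1 = 2.0 ** (n * r1)  # number of sampled X-sequences
    n2 = 2.0 ** (n * r2)  # number of sampled Y-sequences
    x = 1.0 / n1
    # log of x * (1 - x)^(N1 + N2 - 2), the right-hand side of the condition
    log_rhs = math.log(x) + (n1 + n2 - 2.0) * math.log1p(-x)
    # log of the uniform upper bound 2^(-n (I - eps)) on P(U_ij = 1)
    log_alpha_upper = -n * (mutual_info - eps) * math.log(2.0)
    condition_holds = log_alpha_upper <= log_rhs
    log_lower_bound = n1 * n2 * math.log1p(-x)
    return condition_holds, log_lower_bound, -(n2 + 1.0)


# R_2 <= R_1 < I(X; Y): the condition holds and the bound dominates exp(-(N2 + 1)).
ok, log_lb, log_claim = local_lemma_check(20, 0.5, 0.3, 1.0)
# R_1 > I(X; Y): the same choice of x no longer satisfies the condition.
ok_high_rate, _, _ = local_lemma_check(20, 1.2, 0.3, 1.0)
```

The comparison `log_lb >= log_claim` reflects the elementary estimate $\log(1 - y) \ge -y - y^2$ for $0 \le y \le 1/2$, which gives $(1 - 2^{-nR_1})^{2^{n(R_1 + R_2)}} \ge \exp(-2^{nR_2} - 2^{n(R_2 - R_1)}) \ge \exp(-(2^{nR_2} + 1))$ when $R_2 \le R_1$.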

We can derive a similar bound using the second version of the local lemma given in 
Lemma 5. While $\Gamma$ and $\theta$ are the same as estimated earlier, $\tau = \max_{(i,j)} P(U_{ij} = 1)$ is upper 
bounded by $\alpha'$ as defined in equation (76). Hence, 

$$P(U = 0) \ge \exp\left( -\Gamma \varphi(\theta + \tau) \right). \tag{82}$$

Under the same assumption $R_2 \le R_1 < I(X; Y)$, $\theta + \tau \le (2^{nR_1} + 2^{nR_2} - 2)\alpha' \to 0$ as 
$n \to \infty$ and hence $\varphi(\theta + \tau) \to 1$. Combining equations (81) and (82), taking logarithms 
and letting $n \to \infty$, we get 

$$\lim_{n \to \infty} \frac{1}{n} \log\log \frac{1}{P(U = 0)} \le \min\left( R_2,\ R_1 + R_2 - I(X; Y) \right). \tag{83}$$
Comparing this to equation (73) shows that this expression is asymptotically tight in 
the regime $R_2 \le R_1 < I(X; Y)$. ■ 